(57) Abstract

A speech recognition and control system suitable for mobile telephones has, for each of four or five major languages, a preprogrammed 
store containing many variations of a set of telephone operating commands. The user can manually select one of these four or five major 
languages, and the selected preprogrammed language store will be consulted when the user utters a word into the telephone. A match 
(recognition) prompts execution of the desired telephone function. The user can replace each of the preprogrammed commands with his own 
user-chosen and -spoken commands to create his own set of commands specific to his own native language/dialect and/or pronunciation. 
The user can also add additional user-dependent commands and a personal user-defined telephone directory. 

Title: Speech Recognition and Control System and Telephone

TECHNICAL FIELD

The invention relates to speech recognition and control systems, particularly to 
those which can be utilized by a user to control a telephone by spoken commands or 
by a combination of spoken and manual commands. 

BACKGROUND OF THE INVENTION 

An apparatus such as a telephone which is to be controlled by spoken commands
must have a system for recognizing the speech received by the microphone. There
are in essence two categories of speech recognition systems: speaker independent
systems and speaker dependent systems.

A speaker dependent speech recognition system is alterable to accommodate the 
individual user of the system, for example the owner or owners of a mobile tele- 
phone, operated in response to spoken commands. Speaker dependent speech 
recognition systems can recognize the command words as pronounced by the 
individual user. It is also possible for the individual user to use words in his own 
native language or user-created language to trigger the various functions of the 
apparatus, for example a mobile telephone. To do this, however, it is first necessary 
for the user to train the recognition system by going through a long and cumbersome 
programming routine, in which each command is repeated several times by the user. 
This must be done before the apparatus can be used at all. Such a system does not
permit anyone else to use the system without going through the same cumbersome
initial procedure.

WO 00/22609 PCT/SE99/01833

A speaker independent system is essentially a system which can recognize spoken
words in the vocabulary regardless of variations in the speaker's speech depending
on sex, age and accent. A large number of different speakers must be sampled in
order to provide a broad spectrum of system-recognizable pronunciations of a
particular word. A speaker independent speech recognition system has the advantage
that it can be used immediately without any initial "training" of the system to
recognize the words in the vocabulary. Ideally, a speaker independent speech
recognition system would be so broad as to recognize all different possible
pronunciations and accents as well as having a separate language mode for each
different language in the world. This ideal system is, however, hardly practical,
even if one only tries to cover languages spoken in Europe. The broader the
recognition base and the more languages are included, the more laborious, extensive
and expensive the creation of the speaker independent system will be. Even more
work will be involved as further functions and modifications are to be included in
the speech recognition and control system.

DESCRIPTION OF RELATED ART

Many different speech recognition systems of the above types have been developed.
One such system is described in WO 96/13827 (PCT/GB95/02563) to Ringland et al.
This known system is based on the recognition of individual phonemes (subwords)
which are then combined to form commands which control the various functions of
the apparatus. Instead of recognizing complete words, which theoretically can be
infinite in number, this known system recognizes phonemes, which are the building
blocks of words and are finite in number. After positively identifying a phoneme,
a processor combines it with other positively identified adjacent phonemes to create
a word or phrase. This system is much more economical as regards storage capacity.
During use of this system, when a specific phoneme is recognized by comparison
with a predefined store of standard phonemes, the actual user utterance of this
phoneme, with his particular inflection and accent, is stored in another parallel
memory, thereby improving future recognition of this phoneme when uttered by this
particular speaker.

This known system does not, however, allow the user to substitute another language,
for example if a Swedish user wishes to say "sla" instead of "dial". Nor does it
permit major deviations from the standard pronunciation of a specific phoneme.

SUMMARY OF THE INVENTION 

The present invention overcomes these and other shortcomings with a speech 
recognition and control system and a telephone incorporating this system. The 
speech recognition and control system, which is included in the telephone according 
to the invention, is preprogrammed to recognize at least one speaker independent set 
of audible commands. Several different sets of basic speaker independent audible
commands, one set for each of several major languages such as English, French,
German, Spanish, Japanese, etc., could be produced by the manufacturer of
the system and/or telephones and be preprogrammed into the system. If the system
is incorporated in a mobile telephone for example, the appropriate language can be 
selected manually via the menu language selection function. Thereafter, the user can 
utter one of the limited number of preprogrammed audible commands to the system 
in the selected major language. 

This does not require any initialization or system programming on the part of the 
user. For each of the words in the preprogrammed set of commands, a broad 
spectrum of different standard pronunciations are recognizable by the system in the 
manually selected major language. 

The user is then able, as he is using the system, to add user-dependent commands, 
perhaps relating to personal telephone directory entries. The system according to the 
invention is also configured so that each of the preprogrammed commands may be 
replaced by a user-specific utterance. This may be the user's dialect pronunciation of 
one of the commands in the selected preprogrammed set of commands or it may be a 
corresponding command in the user's native language which is not one of the major 
preprogrammed languages. A Swedish user may replace the English command
"dial" with the Swedish command "sla" so the system will be adaptable to virtually 
any language. 

Furthermore, the replacement feature permits the user to replace the standard 
commands with any code or imaginary commands of the user's choosing. 

The system according to the invention is also provided with a user-actuated function 
to return the system, temporarily or permanently, to the original preprogrammed sets 
of recognizable commands. 

The speech recognition and command system according to the invention thus 
provides the advantages of a user-independent ready-to-use system together with the 
versatility and customization of user-specific and user-defined systems, without the 
disadvantages thereof. 

DESCRIPTION OF THE DRAWING 

The accompanying figure illustrates schematically one embodiment of the speech
recognition and command system, incorporated in this particular example in a
mobile telephone.

DETAILED DESCRIPTION 

The figure shows a block diagram of a speech recognition and control system 
incorporated in a telephone in accordance with the present invention. The speech 
recognition unit 2 and the speech control unit 5 are incorporated in a telephone. 
Preprogrammed command stores 3 in several major languages such as English, 
French, German, Spanish, Japanese, etc. are coupled to the speech recognition unit. 
In each of these language stores a set of command words is stored with many 
different standard variants, to cover a broad spectrum of different pronunciations,
tonal levels etc., of each command to be recognized. 

The user first selects the major language which he wishes to use initially. This 
can be done, for example, by selecting manually a desired language from a menu 
presented in the telephone display window. In this case the arrow in the box 3 indi- 
cates that French has been selected as the initial major language for operating the 
telephone by voice commands. The user can then utter commands in standard
French into the telephone transmitter/microphone. The user may say the command
"compose" (dial) and the speech recognition unit will check with the command store 
for French, which has already been manually selected, to see if the audio signal 
generated by the user saying "compose" matches any of the variants of this com- 
mand stored in the command store for French. If there is a match, then the speech 
recognition unit 2 will send a signal that it has recognized the command for "dial" to 
the speech control unit 5. The speech control unit will in turn send an "execute" 
signal to the operating unit 6 of the telephone to perform the operation "dial". 
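
The recognition path just described can be pictured with a minimal Python sketch.
It is an illustration under stated assumptions, not the implementation of the
invention: the names LANGUAGE_STORES, recognize and control are hypothetical, the
variant strings are invented, and plain text strings stand in for the stored audio
signal patterns that a real recognizer would compare.

```python
# Illustrative sketch only. All names and variant strings are hypothetical;
# text strings stand in for stored audio signal patterns.

# Preprogrammed command stores (3): one per major language. Each maps a
# telephone function to the set of stored variants of its command.
LANGUAGE_STORES = {
    "English": {"dial": {"dial", "dahl"}, "hang_up": {"hang up"}},
    "French": {"dial": {"compose", "composez"}, "hang_up": {"raccroche"}},
}


def recognize(utterance, language):
    """Speech recognition unit (2): return the matched function, or None."""
    store = LANGUAGE_STORES[language]  # store for the manually selected language
    for function, variants in store.items():
        if utterance in variants:
            return function
    return None


def control(function, executed):
    """Speech control unit (5): forward an "execute" signal to the operating unit (6)."""
    if function is not None:
        executed.append(function)  # the list stands in for the operating unit


executed = []
control(recognize("compose", "French"), executed)  # matches a French variant of "dial"
control(recognize("dial", "French"), executed)     # no match in the French store
print(executed)
```

Only the first utterance produces an "execute" signal, because the English word
is not among the variants held in the selected French store.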

A user-programmed command store 4 is arranged in parallel with the preprogram- 
med major language command stores 3. The user may enter utterances in the user- 
programmed command store to replace the commands in the selected major 
language command store. With the telephone in "replace" mode, the user can give 
the command "compose", having chosen French as his initial operating language 
and then give the word for dial in his own native language, saying "sla" if his native 
language is Swedish. He can thus replace all of the commands in the standard set 
of preprogrammed commands with commands in his own particular language, 
dialect or pronunciation. He can even enter secret code words if he wishes. The 
system is also provided with an override function to ignore the commands entered in 
the user-programmed command store and use the selected major language command 
store instead. This will enable another user to be able to use the speech recognition 
and control functions of the telephone. 
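
The interplay between the two stores and the override function might be sketched
as follows. This is a hedged illustration: the class CommandStores and its methods
are hypothetical, strings again stand in for audio patterns, and the sketch assumes
that a replaced command no longer responds to its standard form until the override
is switched on, which is one possible reading of "replace".

```python
# Illustrative sketch only. Class and method names are hypothetical.


class CommandStores:
    def __init__(self, preprogrammed):
        self.preprogrammed = preprogrammed  # selected major language store (3)
        self.user = {}                      # user-programmed store (4): utterance -> function
        self.override = False               # True: ignore the user-programmed store

    def _lookup_preprogrammed(self, utterance):
        for function, variants in self.preprogrammed.items():
            if utterance in variants:
                return function
        return None

    def replace(self, standard_command, user_utterance):
        """"Replace" mode: say the standard command, then the user's own word."""
        function = self._lookup_preprogrammed(standard_command)
        if function is None:
            raise KeyError(standard_command)
        self.user[user_utterance] = function

    def recognize(self, utterance):
        if self.override:
            return self._lookup_preprogrammed(utterance)
        if utterance in self.user:
            return self.user[utterance]
        function = self._lookup_preprogrammed(utterance)
        # Assumption: a function that has been replaced is reachable only
        # through the user's own utterance while the override is off.
        return None if function in self.user.values() else function


stores = CommandStores({"dial": {"compose", "composez"}})
stores.replace("compose", "sla")
print(stores.recognize("sla"))      # user-specific command
stores.override = True              # another user takes over the telephone
print(stores.recognize("compose"))  # standard French command works again
```

Keeping the user-programmed store separate from the preprogrammed stores means the
override only has to skip one lookup; nothing in the preprogrammed stores is altered.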

The user can also enter his own set of recognizable additional commands to compile
his own personal telephone directory, for example. Each stored number may be
coupled to a user-uttered name, and this name can follow the command
"dial/compose/sla" to dial the number of the desired person. The numbers in
the personal telephone directory may be entered manually, automatically or by audio
recognition.
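
A command followed by a directory name could be handled along these lines. The
sketch is an assumption-laden illustration only: DIAL_WORDS, add_entry and interpret
are hypothetical names, the telephone number is fictitious, and an utterance is
modelled as a text string whose first word is the command.

```python
# Illustrative sketch only. Names and the sample number are hypothetical.

DIAL_WORDS = {"dial", "compose", "sla"}  # any accepted form of the dial command
directory = {}                           # user-uttered name -> stored number


def add_entry(name, number):
    """Couple a stored number to a user-uttered name."""
    directory[name] = number


def interpret(utterance):
    """Return the number to dial for "<dial word> <name>", or None."""
    words = utterance.split(maxsplit=1)
    if len(words) == 2 and words[0] in DIAL_WORDS:
        return directory.get(words[1])
    return None


add_entry("anna", "555-0100")
print(interpret("sla anna"))  # number stored for "anna"
print(interpret("dial bob"))  # no such entry
```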

It is also envisioned that the user will be able to add additional commands to the 
user-programmed command store. For example, dormant functions may be activated 
and be controlled by a user-defined voice command. One simple example would be 
the display of the remaining battery charge by uttering any selected user-defined 
command. The current time in any selected time zone might also be displayed by 
uttering user-defined commands. 

CLAIMS

1. A speech recognition and control system comprising: 

a microphone for receiving audible commands spoken by a user and generating 
electrical audio signals, 

a processor including a recognizer for recognizing received electrical audio signal 
patterns and means for generating command signals in response thereto, said 
processor being coupled to said microphone to receive its generated electrical audio 
signals, 

wherein said processor is preprogrammed to recognize at least one speaker 
independent set of audible commands, some or all of said audible commands being 
individually replaceable by the user with user-chosen and -spoken audible 
commands which are user-specific. 

2. Speech recognition and control system as defined in Claim 1, wherein said 
processor is preprogrammed to recognize a plurality of speaker independent sets of 
audible commands in a plurality of different languages. 

3. Speech recognition and control system as defined in Claim 1, wherein the
processor is disposed to incorporate additional user-specific spoken audible 
commands entered by the user. 

4. Telephone, comprising a speech recognition and control system as defined in 
Claim 1, 2 or 3, wherein said microphone is a telephone transmitter and said 
processor generates command signals to operate the telephone. 

5. Telephone as defined in Claim 4, wherein a preprogrammed speaker independent 
set of audible commands in a desired language can be employed by manually 
selecting a desired language mode on the telephone. 

6. Telephone as defined in Claim 4, wherein the telephone is a mobile telephone.

7. Telephone as defined in Claim 4, wherein the additional user-specific spoken
audible commands comprise a list of names with related telephone numbers. 